skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Monga, Aarav"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. We present a database for automatic understanding of Social Engagement in MultiParty Interaction (SEMPI). Social engagement is an important social signal characterizing the level of participation of an interlocutor in a conversation. Social engagement involves maintaining attention and establishing connection and rapport. Machine understanding of social engagement can enable an autonomous agent to better understand the state of human participation and involvement to select optimal actions in human-machine social interaction. Recently, video-mediated interaction platforms, e.g., Zoom, have become very popular. The ease of use and increased accessibility of video calls have made them a preferred medium for multiparty conversations, including support groups and group therapy sessions. To create this dataset, we first collected a set of publicly available video calls posted on YouTube. We then segmented the videos by speech turn and cropped the videos to generate single-participant videos. We developed a questionnaire for assessing the level of social engagement by listeners in a conversation probing the relevant nonverbal behaviors for social engagement, including back-channeling, gaze, and expressions. We used Prolific, a crowd-sourcing platform, to annotate 3,505 videos of 76 listeners by three people, reaching a moderate to high inter-rater agreement of 0.693. This resulted in a database with aggregated engagement scores from the annotators. We developed a baseline multimodal pipeline using the state-of-the-art pre-trained models to track the level of engagement achieving the CCC score of 0.454. The results demonstrate the utility of the database for future applications in video-mediated human-machine interaction and human-human social skill assessment. Our dataset and code are available at https://github.com/ihp-lab/SEMPI. 
    more » « less
    Free, publicly-accessible full text available November 4, 2025